International Journal of Epidemiology — Latest Matching Preprints

1

Breast cancer over-diagnosis due to mammography screening - A long-term follow-up population study of BreastScreen Norway

Heggland, T.; Vatten, L. J.; Opdahl, S.; Weedon-Fekjaer, H.

2026-06-03 epidemiology 10.64898/2026.06.02.26354696 medRxiv

Top 0.1%

38.0%

Objectives Estimates of breast cancer over-diagnosis related to mammography screening vary substantially. Over-diagnosis is commonly defined as cases that would not have been detected during the person's remaining lifetime in the absence of screening. Here we aim to quantify over-diagnosis in the population-based BreastScreen Norway mammography screening program using long-term follow-up and more detailed modeling than previous studies. Setting We applied data on Norwegian screening patterns and breast carcinoma incidence for the period 1987-2019, covering women aged 49-84 years, leveraging the gradual implementation of the organized biennial BreastScreen Norway screening program for women aged 50-69 during 1995-2005. Methods Using an extended age-period-cohort model, we estimated excess lifetime risk of invasive breast cancer and ductal carcinoma in situ in the presence of program screening, as an indicator of over-diagnosis among screen-detected cases. Results Lifetime risk of breast carcinomas was 6.6% (95% confidence interval 2.5% to 10.7%) higher for invited than for non-invited women. This indicates that 18% (95% confidence interval 7.3% to 28.0%) of screen-detected cases may be over-diagnosed, and that approximately one in 86 (95% confidence interval 54 to 210) screened women were over-diagnosed during their screening period. Using effect estimates from previous studies, we estimated that approximately three women are over-diagnosed for every breast cancer death prevented by screening, and that 87% of over-diagnosed tumors might grow extremely slowly. Conclusions Over-diagnosis related to mammography screening is a considerable problem, but its extent may be smaller than reported in some previous studies. Most over-diagnosed tumors likely grow very slowly.

2

Early life multidimensional disadvantage of South Australian children: a whole-population linked data study

Kalamkarian, A.; Pilkington, R. M.; Lynch, J.; Mittinty, M. N.; Malvaso, C.; Hawkins, K.; Pharo, H.; Beck, K.; Chittleborough, C. R.

2026-06-05 epidemiology 10.64898/2026.06.03.26354860 medRxiv

Top 0.1%

17.1%

Background: Whole-population linked administrative data platforms provide an opportunity to generate evidence on early life multidimensional disadvantage to inform resourcing and service provision to families with complex needs. Methods: We used individual-level de-identified data from nine administrative data sources included in the Better Evidence Better Outcomes Linked Data (BEBOLD) platform. The population included all children born in South Australia between 2004 and 2011 (n=143,083), and their parents. We described the prevalence and distribution of multiple disadvantages affecting children from the 12 months before birth to age 5. Eleven domains of parental disadvantage were created: economic, education, access to services, mental health, substance misuse, smoking during pregnancy, domestic and family violence, health, child protection contact, justice system contact, and death. We investigated the concordance of our measure with an area-level socioeconomic measure used in government reporting. Results: One in two children (48%) were exposed to at least one disadvantage domain, and one in seven (14%) were exposed to three or more domains before age five. Economic disadvantage was most prevalent, affecting one in four (27%) children, of whom 75% were exposed to additional forms of disadvantage. Substance misuse, domestic and family violence, and justice system contact were the least likely domains to occur in isolation. Only 54.4% of children who experienced five or more disadvantage domains were classified in the area-level socioeconomic measure's 'most disadvantaged' quintile. Conclusion: Early life exposure to parental disadvantage can be highly multidimensional. Measurement across different systems is important for informing coordinated service provision for families with complex needs.

3

A New Mixed Frequency Regression Model For Environmental Epidemiology

Shukla, N.; Bartington, S. E.; Hansell, A. L.; Lucas, T. C.

2026-06-04 epidemiology 10.64898/2026.06.03.26354801 medRxiv

Top 0.1%

14.8%

Background: In the absence of high-resolution response data, exposure-response modelling often relies on aggregated low-frequency exposure data, leading to loss of high-resolution information. Mixed Data Sampling (MIDAS) from econometrics offers an alternative but is limited due to its inability to make high-resolution predictions, inflexible likelihoods and penalised nonlinear functions, and limited visualization options. We propose a mixed-frequency Distributed Lag Non-linear Model (mf-DLNM) which can eliminate the need to aggregate exposure data in environmental epidemiology and provide high resolution predictions for time series studies. Methods: We evaluated the inference and predictive performance of the mf-DLNM. To evaluate its ability to estimate exposure-response relationships, we applied mf-DLNM and same-frequency (sf)-DLNM using data from the West Midlands, UK. Additionally, we compared the predictive performance of mf-DLNM with sf-DLNM and MIDAS across nine regions of England. As MIDAS cannot predict at the resolution of the predictor (daily), we compared the predictive performance of mf-DLNM and MIDAS at weekly resolution. To test the model's ability to predict high temporal resolution risk (daily), we compared sf-DLNM (with access to daily mortality counts) with mf-DLNM (with access only to weekly mortality counts). Results: In the West Midlands example, mf-DLNM performed comparably to sf-DLNM in estimating daily risk of temperature on respiratory mortality. Furthermore, mf-DLNM and MIDAS exhibited similar performance for weekly predictions. For high-resolution predictions, mf-DLNM and sf-DLNM showed nearly similar performance, despite mf-DLNM having access only to low-resolution response data. 
Conclusion: This mixed-frequency approach in environmental epidemiology overcomes the limitations of predicting health risks using aggregated exposure data and provides estimates of high-resolution outcomes in the absence of high-frequency health outcome datasets.

4

Using human genetic variation to estimate the effect of lipoprotein(a) lowering on pregnancy outcomes

Urquijo, H.; Goldfine, A. B.; Casas, J. P.; Xu, H.; Timsit, Y. E.; Mendelson, M. M.; Hache, C.; Jones, I.; Arustamian, D.; Magnus, M. C.; Gaunt, T. R.; Lawlor, D. A.; Borges, M. C.

2026-05-20 epidemiology 10.64898/2026.05.18.26351595 medRxiv

Top 0.1%

14.5%

Background: Lipoprotein(a) (Lp[a]) is a genetically determined causal and independent cardiovascular risk factor, and Lp(a)-targeted therapies are being developed. However, evidence on the safety of substantial Lp(a) lowering during pregnancy is limited. We evaluated the impact of Lp(a) lowering on adverse pregnancy and perinatal outcomes (APPOs) using human genetic evidence. Material and Methods: We applied a drug-target Mendelian randomization (MR) approach using genetic variants associated with Lp(a) in the UK Biobank at the LPA locus to proxy pharmacological Lp(a) lowering. Summary-level APPO data were obtained from the MR-PREG collaboration, comprising up to 714,899 women across multiple studies. Twenty APPOs were included. Sensitivity analyses included adjustment for fetal genotype, alternative Lp(a) datasets, leave-one-study-out analyses, and exploration of Lp(a) genetic scores and individuals homozygous for LPA loss-of-function variants in the UK Biobank. Results: Across 20 APPOs, MR estimates showed no strong evidence of causal effects, with no associations surviving false discovery rate P-value correction. Most estimates were close to null, including gestational hypertension, gestational diabetes, preeclampsia, miscarriage and neonatal intensive care unit admission. Some associations were slightly larger in magnitude but with wide confidence intervals: gestational age (mean difference 0.04 weeks, 95% CI 0.02-0.06 per 210 nmol/L reduction in Lp[a]) and congenital malformation (OR 0.82, 95% CI: 0.72-0.94) in the protective direction of effect, and higher odds of stillbirth (OR 1.09, 95% CI: 1.00-1.19) and low Apgar at 1 minute (OR 1.11, 95% CI: 0.99-1.24). Sensitivity analyses consistently supported the primary findings, with no evidence of increased maternal or offspring risk in analyses adjusting for maternal-fetal genotype, across alternative exposure datasets, or in leave-one-study-out tests. Individual-level analyses of Lp(a) genetic score and LPA loss-of-function variants showed no associations, although power was limited. Conclusion: These findings suggest that substantial lowering of Lp(a) is unlikely to increase APPO risk, although modest effects, particularly for rare outcomes, cannot be excluded.

5

Global variation in cardiometabolic risk structures: A 48-country comparative Bayesian network analysis in 146,000 participants using WHO STEPS data

Babagoli, M. A.; Beller, M. J.; Scutari, M.; Gonzalez-Rivas, J. P.; Noronha, J. C.; Medicine, A.; Sulbaran, N.; Cabrera, S. S.; Fallahzadeh, A.; Iruvanti, S.; Nieto-Martinez, R.; Mechanick, J. I.

2026-05-20 public and global health 10.64898/2026.05.15.26353288 medRxiv

Top 0.1%

14.2%

Background Cardiometabolic-based chronic disease (CMBCD) at an individual level results from complex interactions among a multi-tiered network of sociodemographic, behavioral, and metabolic factors. Though a consensus set of risk factors drives CMBCD, population context influences risk factor effects and interactions. To better understand this phenomenon, we investigated the multi-tiered networking of cardiometabolic variables across diverse populations using a comparative modelling approach. Methods and Findings Utilizing nationally representative cross-sectional data from 48 countries participating in the World Health Organization "STEPwise approach to noncommunicable disease risk factor surveillance" survey, we learned country-specific Bayesian networks including sociodemographic, behavioral, and cardiometabolic variables (adiposity, diabetes, hypertension, hyperlipidemia, and cardiovascular disease). By computing the structural Hamming distance between pairs of networks, we compared differences in network structures across regions and country income levels. We then used the learned networks to assess individual risk factor influences and interactions on cardiometabolic outcomes. Country-specific Bayesian networks varied in terms of the risk factors directly and indirectly associated with the cardiometabolic outcomes. Network structures differed significantly across regions (p = 0.023) but not across income levels (p = 0.91). These results were robust to an alternative learning algorithm, network comparison metric, and data imputation approach. Older age (60+ vs. 30-44 years old) was associated with a greater increase in probability of obesity in Europe and Central Asia (+80%) compared to other regions. 
Higher education was associated with increased probability of obesity (+53%), diabetes (+18%), and hypertension (+2%) in South Asia but decreased probability of obesity (-10%), diabetes (-32%), hypertension (-16%), and hyperlipidemia (-25%) in Middle East and North Africa. The interaction between age and sex in predicting obesity was significant in the highest proportion of countries in Europe and Central Asia compared to other regions. While this dataset provided standardized data across multiple countries to define cardiometabolic risk factors and drivers, there was limited data on certain health outcomes and uneven availability of data across regions. Conclusions These results revealed specific regional patterns of multi-tiered cardiometabolic risk structures, emphasizing the need for regionally tailored public health strategies rather than applying generalized consensus evidence-based models. Future research should explore the structural drivers of regional differences in inter-relationships of cardiometabolic risk factors, drivers, and disease.
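
The network comparison described in this abstract rests on the structural Hamming distance. As a minimal sketch only, and not the authors' implementation, the distance can be computed as the number of node pairs whose connection differs between two directed graphs. The variable names and the toy networks below are hypothetical, and the convention that a reversed edge counts as one difference is an assumption: the abstract does not say which variant was used, and other variants count a reversal as two or compare equivalence classes instead.

```python
from itertools import combinations


def edge_state(edges, a, b):
    """Describe how nodes a and b are connected in a set of directed edges."""
    if (a, b) in edges:
        return "forward"
    if (b, a) in edges:
        return "backward"
    return None


def structural_hamming_distance(nodes, edges_1, edges_2):
    """Count node pairs whose connection differs between two directed graphs.

    Assumed convention: a missing, extra, or reversed edge each counts once.
    """
    return sum(
        edge_state(edges_1, a, b) != edge_state(edges_2, a, b)
        for a, b in combinations(sorted(nodes), 2)
    )


# Hypothetical toy networks, not taken from the study.
nodes = {"age", "education", "obesity", "diabetes"}
country_a = {("age", "obesity"), ("obesity", "diabetes"), ("education", "obesity")}
country_b = {("age", "obesity"), ("diabetes", "obesity"), ("age", "diabetes")}

distance = structural_hamming_distance(nodes, country_a, country_b)
print(distance)
```

In this toy pair the networks differ by one extra edge, one missing edge, and one reversed edge, so the distance is 3.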

6

Neonatal mortality risk of large-for-gestational age and macrosomic live births in low- and middle-income subnational birth cohorts: An individual participant meta-analysis (2000-2017)

Kirakoya Samadoulougou, F.; Barche, B.; Ukwishaka, J.; Subedi, S.; Erchick, D. J.; Suarez Idueta, L.; Hamer, D. H.; Semrau, K. E. A.; Hamomba, F. M.; Banda, B.; Manasyan, A.; Pry, J. M.; Maleta, K.; Ashorn, U.; Schmiegelow, C.; Hjort, L.; Minja, D. T. R.; Lusingu, J. P. A.; Freitas da Silveira, M.; Buffarini, R.; Baqui, A. H.; Khanam, R.; Ahmed, S.; Zhu, Z.; Zeng, L.; Cheng, Y.; Lachat, C.; Roberfroid, D.; Huybregts, L.; Toe, L. C.; Tielsch, J. M.; Khatry, S. K.; Mullany, L. C.; Ohuma, E. O.; Blencowe, H.; Katz, J.; Lee, A. C. C.; Black, R. E.; Hazel, E. A.

2026-06-06 public and global health 10.64898/2026.06.03.26354851 medRxiv

Top 0.1%

13.9%

Background Large-for-gestational-age (LGA) and macrosomic newborns are at increased risk of adverse perinatal outcomes, including death, yet the burden of neonatal mortality associated with these conditions in low- and middle-income countries (LMICs), where ongoing nutritional and epidemiological transitions suggest their prevalence will rise, remains poorly quantified. In this study, we quantify the neonatal mortality risk associated with LGA and macrosomia from 16 subnational birth cohorts in low- and middle-income countries between 2000 and 2017. Methods and findings This is an individual-participant meta-analysis to estimate neonatal mortality rates (NMRs) and relative risks among LGA infants (>90th and >97th percentile birth weight-for-gestational-age using INTERGROWTH-21st) versus appropriate-for-gestational-age (AGA, 10th-90th percentile) infants. Macrosomic (≥4000 g and ≥4500 g) neonates were compared with those weighing 2500-3999 g. Missing birth weights were imputed using recalibration and multiple imputation methods. We used random effects meta-analysis to pool relative risks. Median prevalences of LGA >90th and >97th percentile were 5.3% (interquartile range 3.6-8.2) and 2.6% (IQR 1.3-4.5), respectively; macrosomia (≥4000 g and ≥4500 g) prevalences were 1.0% (IQR 0.3-3.1) and 0.06% (IQR 0.0-0.30), respectively. Mortality was highest among preterm plus LGA infants (61.3 per 1000). LGA infants in the >90th percentile had over twofold increased mortality compared with appropriate-for-gestational-age infants (RR: 2.46; 95% CI: 1.86-3.25), while >97th percentile infants had a higher risk (RR: 3.77; 95% CI: 2.50-5.69). Term LGA >97th percentile infants also showed elevated mortality (RR: 3.14; 95% CI: 1.58-6.22). For LGA >97th percentile, the risk was higher in the early neonatal period (RR: 2.71; 95% CI: 1.92-3.82) than late (RR: 1.69; 95% CI: 1.22-2.34). There was no overall association between macrosomia (≥4000 g) and neonatal mortality. Population attributable fractions were 7.2% for LGA >90th percentile and 0.4% for macrosomia (≥4000 g). Conclusions Neonatal mortality risks were elevated among LGA infants in low- and middle-income countries, particularly at extreme values (>97th percentile) and during the early neonatal period. Macrosomia showed weaker, less robust associations. Although LGA prevalence is currently low (~5%) and contributes less to neonatal mortality than small newborns, ongoing nutritional and epidemiological transitions suggest increasing prevalence. This highlights the need for strengthened surveillance, monitoring, and improved delivery planning to ensure that no population is left behind.
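
To see how a population attributable fraction relates to prevalence and relative risk, the sketch below applies Levin's formula, PAF = p(RR - 1) / (1 + p(RR - 1)), to the figures quoted in the abstract for LGA >90th percentile. This is an illustration only. The abstract does not state how the authors computed their attributable fractions, and using the median cohort prevalence as p is an assumption made here.

```python
def levin_paf(prevalence, relative_risk):
    """Population attributable fraction from exposure prevalence and relative risk."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Figures quoted in the abstract: median prevalence 5.3%, pooled RR 2.46.
# Treating the median prevalence as the exposure prevalence is an assumption.
paf_lga = levin_paf(prevalence=0.053, relative_risk=2.46)
print(f"{paf_lga:.1%}")
```

Under these assumptions the result rounds to 7.2%, in line with the value reported in the abstract.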

7

Incremental Clinical Value of Single-Molecule Nanopore Sequencing in Thalassemia Testing: A Prospective Double-blind, Multicenter Study

Xiang, J.; Zhu, B.; Xu, H.; Chen, Y.; Sun, X.; Xiang, R.; Zhao, Y.; Liu, W.; Zhang, L.; He, J.; Liu, J.; Chen, Y.; Fan, Z.; Zhang, H.; Tan, J.; Pang, L.; Shi, L.; Kong, Y.; Cai, A.

2026-06-09 hematology 10.64898/2026.06.09.26354559 medRxiv

Top 0.1%

12.7%

Background Thalassemia is one of the most common monogenic disorders worldwide. Current screening strategies combining hematological testing with molecular assays still carry a risk of missed diagnoses and suboptimal efficiency, particularly for complex structural variants and rare mutations. Methods In this prospective double-blind, multicenter cohort study of 3,842 participants (3,362 pregnant women and 480 male partners), we conducted a head-to-head comparison to systematically evaluate the incremental clinical value and detection performance of single-molecule nanopore sequencing in thalassemia (SMITH) against conventional hematological testing and next-generation sequencing (NGS). Findings The overall concordance rate between NGS and SMITH was 98.6% (3789/3842). The discrepant cases (n=53) were directly attributed to the superior detection capabilities of SMITH, which successfully identified complex structural rearrangements, including 45 -globin gene triplications and four HK alleles, that were missed by NGS. Furthermore, SMITH accurately detected four rare variants (c.134_135insT/, c.-22(C>T)/, βN/βc.316-290delinsAGGGCAATAATTT and β3.5 kb deletion/βN) and resolved ten trans and three cis configurations within the globin gene allele. Clinically, these technical advantages translated to a 9.3% (5/54) increase in the detection rate of high-risk prenatal couples, effectively preventing one birth affected by moderate-to-severe thalassemia. Additionally, SMITH corrected a diagnostic discrepancy in one case (HK vs. -3.7), sparing the couple from an unnecessary invasive procedure. Interpretation Our findings demonstrate that SMITH provides a powerful platform for resolving globin gene rearrangements, detecting rare variants, and enabling direct haplotype phasing. By effectively eliminating diagnostic blind spots, SMITH is expected to become an optimal method for thalassemia prevention programs.
Funding This study was supported by Chinese National Natural Science Foundation Projects 81760037 and 82271894.

8

Development of Longitudinal, Linked Maternal-Infant Cohorts using the Epic Cosmos Electronic Health Record Dataset

Leonard, S. A.; Dysart, K.; Callahan, A.; Siadat, S.; Zhang, J.; Handley, S. C.; Huybrechts, K. F.; Igbinosa, I.; Bateman, B. T.

2026-06-04 epidemiology 10.64898/2026.06.02.26354757 medRxiv

Top 0.1%

10.7%

Background: Epic Cosmos is a relatively new centralized electronic health record dataset with high potential utility in perinatal epidemiologic research. Objectives: The study objectives were to develop replicable steps to create longitudinal, linked maternal-infant cohorts in Cosmos, assess completeness of key variables, evaluate potential selection bias with restrictions for longitudinal healthcare encounters, and provide an example epidemiologic analysis. Methods: We created maternal-infant cohorts by starting with live births during 2023-2024 recorded in the BirthFact data table and joining with additional data tables as needed. We selected and created variables for perinatal characteristics, common comorbidities, and routinely measured vital signs and laboratory values, and assessed variable completeness. We sequentially restricted the birth cohort for maternal-infant linkage and longitudinal healthcare from first-trimester prenatal care encounter through infant follow-up care within 12 weeks post-discharge from birth hospitalization. Finally, we conducted an example analysis of the association between high systolic blood pressure in the first trimester (≥140 mm Hg) and later onset of preeclampsia among those with chronic hypertension. Results: The total linked birth cohort included 2,624,186 pregnancies. Completeness was >90% for most variables assessed but was 77% for racial and ethnic group and 76% for body mass index at delivery. Characteristics of the cohort were similar to those reported for the entire United States birth population based on birth certificate data, including similar regional and racial-ethnic composition. Longitudinal cohort restriction requiring linked records from first trimester prenatal care through infant follow-up care reduced the cohort size to 509,148 pregnancies. However, restriction had minimal effects on cohort characteristics. 
In the example analysis, high systolic blood pressure was associated with increased risk of preeclampsia among those with chronic hypertension (aRR: 1.26; 95% CI: 1.22, 1.30). Conclusions: This study provides a rigorous and reproducible approach to creating longitudinal, linked maternal-infant cohorts in Epic Cosmos and the analytical findings suggest high data quality and representativeness.

9

Integrative Genomic Analyses Identify COL21A1 and ENPEP-FGF5 Regulatory Pathways for Blood Pressure Variation in East Asians

Lau, Z. C.; Chang, X.; Sim, K. S.; Wu, H.; Naaz, A.; Muniasamy, U.; Khor, C.-C.; Koh, W.-P.; Vitaly, S.; Dorajoo, R.

2026-05-18 genetics 10.64898/2026.05.14.725285 medRxiv

Top 0.1%

10.2%

Background Hypertension is a highly heritable cardiovascular disorder and a major determinant of cardiometabolic disease, including diabetes. However, the regulatory genes and tissue-specific mechanisms underlying blood pressure variations remain incompletely understood. Methods Leveraging a well-characterized prospective population-based cohort comprising 27,308 participants from the Singapore Chinese Health Study (SCHS), we evaluated genome-wide genetic associations for five blood pressure traits: hypertension status, systolic blood pressure, diastolic blood pressure, mean arterial pressure (MAP) and pulse pressure (PP). Additionally, we conducted a transcriptome-wide association study (TWAS), integrating gene expression data from 49 tissues, followed by colocalization and fine-mapping to prioritize regulatory genes. Association of identified variants with incident diabetes was additionally evaluated in the longitudinal data. Results We validated 10 blood pressure loci (P between 1.64 × 10⁻²⁰ and 4.10 × 10⁻⁸) and identified an East Asian-specific splice donor variant at the COL21A1 gene associated with PP (rs149344559, P = 6.78 × 10⁻¹⁰). Integrative analyses prioritized FGF5 in kidney cortex and ENPEP in pituitary tissue as candidate regulatory genes. The blood pressure-lowering allele at ENPEP (T allele, rs1879056) was associated with reduced risk of incident diabetes. Mediation analysis demonstrated that approximately 21% of the genetic association with diabetes was mediated through MAP (P for indirect effect = 2 × 10⁻¹⁶). Conclusion This study refines genetic predispositions for blood pressure among East Asians. We further delineate tissue-specific regulatory pathways underlying blood pressure variations and identify ENPEP-mediated dysfunctions linking blood pressure genetics to diabetes risk, underscoring integrated disease mechanisms.

10

Bias from small-count suppression in county-level cancer disparity estimates: a calibrated simulation study

Gahan, K.

2026-06-08 epidemiology 10.64898/2026.06.05.26355021 medRxiv

Top 0.1%

10.1%

Abstract Background. Area-level cancer disparities are routinely estimated from public county data in which rates based on small counts (fewer than 16 cases or deaths) are suppressed. Analysts typically drop suppressed counties (complete-case analysis). Because suppression depends on case counts tied to population size and demographic composition, this missingness may be informative, but its effect on the disparity estimate has not, to our knowledge, been quantified. Methods. In a cross-sectional ecological study of 3,143 U.S. counties (analytic sample 3,018 with computable exposure) using one frozen public release of NCI State Cancer Profiles incidence and mortality data and ACS 2018-2022 5-year data, we estimated the most- versus least-deprived ICE(race+income) quintile rate ratio (RR) and rate difference for female breast, stomach, and cervix cancers under four suppression-handling methods: complete-case, available-case, bounding, and model-based small-area estimation. We characterized which counties were erased, and, following the ADEMP framework, ran a Monte Carlo simulation (1,000 replicates per cell; Monte Carlo standard error of bias approximately 0.0025) calibrated to the release to measure bias against a known truth. Analyses were pre-registered. Results. The suppressed fraction rose with rarity: 7.4% of counties for breast, 61.3% for stomach, and 75.7% for cervix incidence. Suppression was concentrated in the most-deprived quintile (cervix, 81.8% suppressed vs 63.8% least-deprived) and overwhelmingly removed rural rather than minority residents (cervix: 81% of the rural but 9% of the minority population erased). For breast (little suppression) the RR was 0.87 (95% CI 0.85-0.89) and identical across methods; for cervix incidence the complete-case RR (1.56) exceeded the model-based estimate (1.50), and for cervix mortality (91% suppressed) complete-case (1.86) exceeded model-based (1.56) by 16% with a wide bounding interval (1.88-2.62). 
In calibrated simulation, population-weighted complete-case bias was small (less than 2%) at the observed deprivation-county-size correlation and grew with rarity, threshold, and unweighted aggregation; its direction was conditional, becoming positive (over-estimation) as deprived counties became smaller. Conclusions. Complete-case handling of suppressed counties over-estimates rare-cancer area disparities relative to methods that retain them, while silently erasing most of the rural and most-deprived communities the estimate is meant to represent. The effect is negligible for common cancers and grows with rarity. Public-data disparity analyses should report the suppressed fraction and use bounded or model-based estimates by default. Keywords: cancer disparities; small-count suppression; Index of Concentration at the Extremes; informative missingness; small-area estimation; rural health.

11

Acceptability of community-based maternal and newborn care in South Sudan: A qualitative study using the Theoretical Framework of Acceptability

Luka, L. A.; Macharia, T.; Kimemia, G.; Nanda, G.; Ayom, A. A.; Deng, A.; Kuol, J. M. D.; Jama, M.; Nyuany, L. M.; Caroline, I.; Noor, K.; Kozuki, N.

2026-05-18 sexual and reproductive health 10.64898/2026.05.14.26353170 medRxiv

Top 0.1%

10.1%

South Sudan faces among the highest maternal and newborn mortality rates globally, with approximately 87% of deliveries occurring at home without skilled birth attendance. In 2024, the International Rescue Committee launched a Community-Based Maternal and Newborn Care (CBMNC) program in Aweil East County, Northern Bahr El Ghazal, deploying trained Boma Health Workers (BHWs) to deliver essential maternal and newborn health services at the household level. This study explored the acceptability of the CBMNC model among diverse stakeholders. This qualitative descriptive study was grounded in the Theoretical Framework of Acceptability (TFA). Data were collected between May and July 2025 through 17 focus group discussions (FGDs), 14 in-depth interviews (IDIs), and 10 key informant interviews (KIIs) with 185 participants, including program recipients, male partners, mothers and mothers-in-law, Boma and Hospital Health Committee (BHC/HHC) members, BHWs, supervisors, and health system stakeholders at state and national levels. Framework analysis, combining deductive coding based on the seven TFA constructs with inductive thematic analysis, was used. CBMNC was well accepted by recipients and their families, despite provider and health system concerns about sustainability. Trust in community-selected BHWs made home-based care valuable, especially given limited facility access. Intervention coherence relied on pictorial aids, repeated visits, and peer learning to address low literacy. Participants perceived commodity interventions like misoprostol and chlorhexidine as impactful, while behavioral counseling was less recognized. Clients faced minimal burden, but providers experienced significant challenges and inadequate compensation. Health stakeholders were cautiously optimistic but questioned lay provider capacity and long-term viability in a fragile environment. 
CBMNC can achieve high community acceptability when delivered through trusted, community-selected health workers using contextually appropriate strategies. However, community acceptability alone is insufficient for sustainable scale-up. Addressing provider compensation, workload, and structural integration into national health systems is essential to ensure that gains in acceptability translate into sustained service delivery.

12

Disentangling infectiousness and susceptibility by age group using transmission pair data: a study of SARS-CoV-2 household transmission

Leung, K. Y.; Miura, F.; Backer, J. A.

2026-06-05 epidemiology 10.64898/2026.06.04.26354892 medRxiv

Top 0.2%

8.7%

Background Differential contributions to transmission across age groups have been reported for many respiratory infections, including SARS-CoV-2. They are crucial for estimating the impact of age-specific interventions. Disentangling these age-dependent contributions remains challenging, as they may reflect differences in contact rates, biological susceptibility, or infectiousness. Aim We aim to jointly estimate age-specific per-contact infectiousness and susceptibility and their effect on the impact of age-specific interventions. Methods The age-specific infectiousness and susceptibility were jointly estimated in a Bayesian framework by combining contact data with transmission pair data (who-infected-whom). We applied this approach to 197,840 self-reported household transmission pairs collected in the Netherlands during the COVID-19 pandemic. Using these estimates, we projected the expected impact of school closure and work-from-home measures during the early stages of an epidemic in the absence of other interventions. Results Both infectiousness and susceptibility to SARS-CoV-2 infection were lowest in children aged 0-9 years and highest in adults over 30 years old, with 2- to 4.5-fold differences between these groups. Projected impacts of age-specific interventions indicated that school closures would reduce the reproduction number by 8% or 29% when age-specific susceptibility and infectiousness were or were not considered, respectively. Conversely, working-from-home policies would lead to reductions of 41% with and 20% without age-specific infectiousness and susceptibility. Conclusion Our method enables robust estimation of age-specific infectiousness and susceptibility. Accounting for these age heterogeneities is essential for projecting the impact of age-targeted interventions. Our approach is adaptable to other respiratory infections and can guide more tailored public health responses.
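
The contrast reported in this abstract (a smaller projected effect of school closure once age-specific susceptibility and infectiousness are included) can be illustrated with a next-generation matrix. The sketch below is not the authors' Bayesian model. It assumes a two-group population (children, adults), a hypothetical contact matrix, and hypothetical relative susceptibility and infectiousness values, and it takes the reproduction number to be proportional to the dominant eigenvalue of the matrix with entries susceptibility[i] * contacts[i][j] * infectiousness[j].

```python
def next_generation_matrix(contacts, susceptibility, infectiousness):
    """Entry (i, j): relative number of infections in group i caused by group j."""
    n = len(contacts)
    return [
        [susceptibility[i] * contacts[i][j] * infectiousness[j] for j in range(n)]
        for i in range(n)
    ]


def dominant_eigenvalue(matrix, iterations=200):
    """Power iteration; adequate for small matrices with positive entries."""
    n = len(matrix)
    vector = [1.0] * n
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        value = max(product)
        vector = [x / value for x in product]
    return value


def relative_reduction(before, after, susceptibility, infectiousness):
    """Fractional drop in the dominant eigenvalue when contacts change."""
    r_before = dominant_eigenvalue(next_generation_matrix(before, susceptibility, infectiousness))
    r_after = dominant_eigenvalue(next_generation_matrix(after, susceptibility, infectiousness))
    return 1.0 - r_after / r_before


# Hypothetical contact rates for [children, adults]; not data from the study.
baseline = [[8.0, 3.0], [3.0, 6.0]]
school_closed = [[2.0, 3.0], [3.0, 6.0]]  # only child-child contacts are reduced

# Children assumed half as susceptible and half as infectious as adults.
with_age_effects = relative_reduction(baseline, school_closed, [0.5, 1.0], [0.5, 1.0])
without_age_effects = relative_reduction(baseline, school_closed, [1.0, 1.0], [1.0, 1.0])

print(f"with age effects: {with_age_effects:.1%}")
print(f"without age effects: {without_age_effects:.1%}")
```

With these made-up inputs, the reduction is much smaller when children are assumed to be less susceptible and less infectious. The direction matches the pattern in the abstract, but the magnitudes are not comparable to the study's estimates.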

13

Who infected the reported cases? Evidence from 678,482 COVID-19 cases with identified infector collected in routine surveillance in the Netherlands, 2020-2022.

Backer, J. A.; Leung, K. Y.; Andeweg, S. P.; Van de Kassteele, J.; Veldhuijzen, I.; Hahne, S.; Wallinga, J.

2026-05-17 epidemiology 10.64898/2026.05.15.26347859 medRxiv

Top 0.2%

8.6%

Background During infectious disease outbreaks, characteristics of reported cases are routinely collected. These give information on becoming infected but not on infecting others. We assess whether linking infectees to infectors, together with their characteristics, can help understand transmission. Methods From the start of the COVID-19 pandemic in the Netherlands, reported cases were asked to identify their most probable infector in routine surveillance, enabling the linking of cases. We assess for the period 27 February 2020 - 11 April 2022 whether the infectees of these transmission pairs are representative of all reported cases, whether the transmission pairs yield verifiable estimates of epidemiological characteristics (here the serial interval), and whether they provide information on transmission that cannot be obtained otherwise. Results Of 8,003,008 reported cases, 678,482 (8.5%) could be linked to their most probable infector. These infectees were largely representative of the reported cases regarding age group, sex, and geographical location. The mean serial interval of 3.6 days (sd 3.4 days) from transmission pairs aligns with literature. Transmissions between age groups largely follow known contact patterns. Most transmissions in September 2021 occurred between persons who were not (fully) vaccinated, indicating the effectiveness of the vaccine, and relatively few between persons with different vaccination status, indicating assortative mixing in vaccination status. Conclusion Transmission pairs can be efficiently collected in routine surveillance, providing insight into disease transmission. The current post-pandemic period provides an excellent opportunity to adjust reporting systems for linking infectees to their most probable infector as preparation for future outbreaks.

14

Living Environments and Mental Health: the environMAP database

Renner, P.; Polemiti, E.; Jentsch, M.; Banks, J. R.; Cleff, D.; Siehl, S.; Dallavalle, M.; Lett, T.; Buck, C.; Castell, S.; Frost, J.; Grabe, H.; Keil, T.; Harth, V.; Kettlitz, R.; Krist, L.; Leitzmann, M.; Mikolajczyk, R.; Naaouf, N.; Obi, N.; Peters, A.; Schneider, A.; Wolf, K.; Nees, F.; Twardziok, S. O.; Marquand, A.; Hese, S.; Schepanski, K.; Schumann, G.; environMENTAL consortium

2026-05-20 occupational and environmental health 10.64898/2026.05.15.26353275 medRxiv

Top 0.2%

8.5%

Environmental exposures are increasingly examined in relation to mental health, yet large-scale epidemiological analyses remain constrained by fragmented geospatial data, heterogeneous spatial and temporal resolutions, and privacy-preserving linkage requirements, limiting systematic investigation of multiple environmental domains at the population level. We present environMAP, a harmonised set of analysis-ready environmental exposure layers derived from open, global sources. environMAP spans the built environment, green and blue spaces, light exposure (solar radiation and night-time light), terrain, weather and extremes, and air pollution. We document data provenance, spatial buffers, preprocessing, projection alignment, and metadata, and provide a reproducible workflow for privacy-preserving linkage to cohort residential locations. To demonstrate utility, we linked environMAP to >200,000 adults in the German National Cohort (NAKO) and summarised self-reported lifetime doctor-diagnosed depression across exposure gradients using sex-stratified descriptive analyses. Gradients were interpretable and broadly consistent with prior evidence, supporting feasibility, scalability, and hypothesis generation. The framework is adaptable to other outcomes, cohorts, and regions.

15

Diverging Pre-Pandemic Mortality Trends: Age-Specific and Cause-Specific Patterns Across High-Income Countries

Perez-Reche, F.; Summers, J.; Jones, G. T.; Macfarlane, G. J.

2026-06-03 public and global health 10.64898/2026.06.01.26354619 medRxiv

Top 0.2%

8.5%

Background: Mortality rates have declined across most high-income countries for decades, but recent evidence suggests a slowdown in improvements or a shift to increasing mortality, particularly among working-age populations. The international distribution and drivers of these trends remain incompletely understood. Methods: Mortality trends during 2012-2019 were analysed using all-cause and cause-specific data from 30 countries. Trends were estimated via linear regression. K-means clustering with Dynamic Time Warping identified countries and ICD-10 chapters with similar temporal trajectories. Results: Trends varied substantially by nation. While Japan, Switzerland, and the Republic of Korea maintained consistent declines in all-cause mortality rates, increases were concentrated in the United States, Canada, and the United Kingdom, most prominently in persons aged 30-59 years. However, cause-specific analysis showed that rising mortality was not confined to these countries: most countries exhibited increases in at least one ICD-10 chapter, with several European countries showing increases across multiple chapters. Across countries, a small set of causes recurred among increasing trends, including external causes (self-harm, drug poisoning) at younger ages and chronic conditions (cardiovascular and liver diseases, specific cancers) in mid-life. Notably, ill-defined causes of death consistently appeared among the increasing causes across countries and age groups. Conclusions: Mortality increases in the 2010s were geographically more widespread than previously recognized. The recurrent rise in mortality from ill-defined causes suggests that an important component of mortality change remains poorly characterized. These findings indicate that stalled health progress is a systemic challenge across many high-income societies.
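
The two building blocks named in the Methods of this abstract, a linear trend per series and a Dynamic Time Warping (DTW) distance between series, can be sketched as follows. This is an illustrative standard-library version, not the authors' code: the function names and the example rates are hypothetical, and the k-means clustering step that would consume the DTW distances is omitted.

```python
def ols_slope(years, rates):
    """Least-squares slope of rates on years (change in rate per year)."""
    n = len(years)
    x_bar = sum(years) / n
    y_bar = sum(rates) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, rates))
    sxx = sum((x - x_bar) ** 2 for x in years)
    return sxy / sxx


def dtw_distance(a, b):
    """Classic DTW distance with absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays on the same point
                cost[i][j - 1],      # b advances, a stays on the same point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


years = list(range(2012, 2020))
# Hypothetical mortality rates per 100,000, for illustration only.
declining = [100, 98, 96, 94, 92, 90, 88, 86]
print(ols_slope(years, declining))  # -2.0

# DTW tolerates a shift in timing: the repeated value costs nothing.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

DTW is used instead of a pointwise distance so that two trajectories with the same shape but slightly different timing are treated as similar.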

16

A wealth index based on two-component polychoric principal component analysis reduces urban bias and improves socioeconomic classification in low- and middle-income country surveys: a validation study using LSMS surveys

Vidaletti, L. P.; Dos Santos, A. M.; Hellwig, F.; Barros, A. J. D.

2026-06-08 epidemiology 10.64898/2026.06.01.26354245 medRxiv

Top 0.2%

8.4%

Background: The traditional wealth index, based on principal component analysis (PCA), used in the Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS), suffers from urban bias, distorting estimates of health inequality. We compared the traditional index (PEAR1) with an alternative two-component polychoric PCA index (POLY2) using annual expenditure from 12 LSMS surveys as the gold standard to determine which provides more accurate SEP measures for equitable policy targeting. Methods: We compared the traditional wealth index (PEAR1) with a two-component polychoric PCA approach (POLY2) using 12 LSMS (Living Standards Measurement Study) surveys (2015-2022) from 12 African countries. Annual household consumption expenditure was the gold standard. We assessed agreement using weighted Cohen's kappa and validated against education (proportion of households with secondary or higher education) using the concentration index (CIX) and slope index of inequality (SII). Results: The POLY2 index showed higher agreement with expenditure quintiles (average national weighted kappa = 43.3%) than the PEAR1 index (35.1%), with notable improvements in urban (43.5% vs. 27.5%) and rural (35.3% vs. 22.4%) areas. POLY2 also attenuated extreme household distributions observed in PEAR1. Education validation showed that POLY2 produced intermediate inequality gradients between the flatter expenditure-based gradient and the steeper PEAR1-based gradient. Conclusion: The POLY2 wealth index is superior to the traditional index, reducing urban-rural bias and providing more accurate socioeconomic classifications. Its adoption in large-scale surveys such as DHS and MICS is recommended to improve equitable monitoring of health inequalities in low- and middle-income countries.
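
Agreement between two quintile classifications, as measured here with weighted Cohen's kappa, can be sketched as below. The abstract does not state which weighting scheme was used, so the linear weights are an assumption; the function name and the example classifications are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, k, weight="linear"):
    """Weighted Cohen's kappa for ordinal categories coded 0..k-1.

    weight="linear" uses |i - j| / (k - 1) as the disagreement weight;
    any other value uses the squared (quadratic) version.
    Assumes both raters use more than one category in total, so the
    expected disagreement is not zero.
    """
    n = len(rater_a)
    # Observed joint proportions.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[a][b] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def w(i, j):
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d

    observed = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    expected = sum(w(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - observed / expected


# Hypothetical quintiles (0 = poorest, 4 = richest) for ten households.
expenditure_q = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
wealth_index_q = [0, 1, 1, 2, 2, 2, 3, 4, 4, 3]
print(round(weighted_kappa(expenditure_q, wealth_index_q, k=5), 3))
```

Weighting matters for quintiles because placing a household one quintile away from its expenditure quintile is a smaller error than placing it four quintiles away.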

17

Mechanism Matters: A Monte Carlo Evaluation of Estimator Validity and Collider Bias in Environmental Mixture Epidemiology

Obeng-Gyasi, E.

2026-05-26 epidemiology 10.64898/2026.05.25.26354044 medRxiv

Top 0.2%

7.3%

Background: Mixture epidemiology deploys sophisticated estimators, Bayesian kernel machine regression with causal mediation analysis (BKMR-CMA), quantile G-computation (QGC), and parametric G-computation, alongside conventional regression. Comparative evaluations have assumed additive, non-mediated data-generating processes, leaving conditions under which estimator choice determines causal validity uncharacterized. Methods: We developed a simulation framework using military-relevant exposure distributions (metals, per- and polyfluoroalkyl substances [PFAS], polychlorinated biphenyls [PCBs]) and allostatic load (AL) across three deployment tiers, with parameters drawn from military occupational health and contamination literature. Four data-generating processes were specified as directed acyclic graphs: direct effects with confounding (M1), full mediation through AL (M2), synergistic AL-exposure interaction (M3), and collider structure (M4). We evaluated ordinary least squares (OLS), QGC, G-computation, and BKMR-CMA on bias, root mean squared error, and 95% confidence interval coverage across 500 Monte Carlo replications at n = 500 and n = 1,000. Results: No estimator dominated across all mechanisms. Under M1, OLS and G-computation produced near-identical modest positive bias; BKMR-CMA achieved lower root mean squared error through kernel shrinkage. Under M2, BKMR-CMA exhibited severe positive bias for AL (mean bias = +0.579 SD units; coverage = 32.8%). Under M3, BKMR-CMA was the only estimator achieving nominal 95% coverage for AL (95.2%), while regression-based approaches fell to 83.6%. Under M4, G-computation produced persistent bias and near-zero coverage for lead, reflecting structural non-identification. Conclusions: Estimator validity is fundamentally mechanism-dependent. 
Researchers should base estimator choice on explicit causal assumptions about whether AL functions as confounder, mediator, moderator, or collider, particularly in military and occupational cohorts. We provide a mechanism-to-estimator mapping for applied researchers.
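
The three performance measures used in this abstract (bias, root mean squared error, and 95% confidence interval coverage over Monte Carlo replications) can be illustrated with a small sketch. The data-generating process below is deliberately simple, with one exposure, no confounding, and a true slope chosen arbitrarily; it is not one of the paper's mechanisms M1 to M4, and all names and parameter values are assumptions for illustration.

```python
import math
import random


def ols_slope_ci(x, y, z=1.96):
    """Simple linear regression slope with a normal-approximation 95% CI."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope - z * se, slope + z * se


def evaluate(true_beta=0.3, n=500, n_reps=500, seed=1):
    """Return (bias, rmse, coverage) of the OLS slope across replications."""
    rng = random.Random(seed)
    errors = []
    covered = 0
    for _ in range(n_reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [true_beta * xi + rng.gauss(0, 1) for xi in x]
        est, lo, hi = ols_slope_ci(x, y)
        errors.append(est - true_beta)
        covered += lo <= true_beta <= hi
    bias = sum(errors) / n_reps
    rmse = math.sqrt(sum(e * e for e in errors) / n_reps)
    return bias, rmse, covered / n_reps


bias, rmse, coverage = evaluate()
print(f"bias={bias:.4f} rmse={rmse:.4f} coverage={coverage:.3f}")
```

When the estimator matches the data-generating process, as it does here, coverage should sit close to the nominal 95%; the abstract's point is that this breaks down when the assumed mechanism is wrong.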

18

Universal Periodic Review recommendations and trajectories of maternal health between 2005 and 2023: a longitudinal ecological analysis of 89 countries

Uppal, A.; Thomas, R.; De Pasquale, M.; Sillo, J.; Getahun, H.

2026-06-05 public and global health 10.64898/2026.06.03.26354800 medRxiv

Top 0.2%

7.3%

Background: The Universal Periodic Review (UPR) is a peer-review mechanism established to hold UN Member States accountable for human rights including the right to health, yet evidence on its impact on health outcomes is limited. We evaluated whether UPR engagement is associated with accelerated improvements in maternal health trajectories. Methods and Findings: We conducted a longitudinal ecological analysis of 89 countries with a baseline maternal mortality ratio (MMR) of 70 or greater per 100,000 live births in 2005. Outcomes were trajectories of annual MMR, skilled birth attendance (SBA), and contraceptive prevalence rate (CPR), from 2005 to 2023. The exposure was the volume of health-related UPR recommendations received across three cycles, thematically classified using a validated rule-based algorithm. Mixed-effects models adjusted for time-varying GDP per capita and historical fragility. The 89 countries received 41,733 UPR recommendations across three cycles, of which 405 (1%) were related to maternal health. Maternal health recommendations were preferentially directed at countries with higher baseline MMR and lower SBA. After adjustment, each additional maternal health recommendation was associated with a 0.24% [95% confidence interval (CI): 0.08, 0.40] faster annual reduction in MMR, a 0.52% [0.12, 0.91] faster annual gain in the odds of SBA, and a 0.21% [0.09, 0.34] faster annual gain in the odds of CPR. Broader recommendations on women's health and health systems and services were also associated with faster annual improvements in trajectories across all three outcomes; recommendations on abortion, family planning, sexual health and wellbeing, and sexual education tended to be directed towards lower-burden countries and were not associated with differences in any trajectories. It is important to note that the ecological design precludes causal inference. 
Conclusions: Receiving UPR recommendations on the themes of maternal health, women's health, and health systems and services is associated with accelerated improvements in maternal health trajectories among high-burden countries. These findings suggest that international human rights accountability mechanisms may have a role in supporting national progress on maternal health.

19

Integrating enriched case data from national laboratory testing with population-based case-control analyses: a novel statistical likelihood-ratio methodology for PS4 applied to 325,345 breast cancer cases and 671,006 controls

Allen, S.; Rowlands, C. F.; Garrett, A.; Couch, F.; Richardson, M. E.; Pesaran, T.; Pethick, J.; Lavelle, K.; McRonald, F.; Vernon, S.; Torr, B.; Loong, L.; Aungraheeta, R.; Durkie, M.; Burghel, G. J.; Callaway, A.; Robinson, R.; Field, J.; Frugtniet, B.; Palmer-Smith, S.; Grant, J.; Pagan, J.; McDevitt, T.; Snape, K.; Hanson, H.; McVeigh, T.; Loveday, C.; Jones, M.; Hardy, S.; Turnbull, C.; CanVIG-UK

2026-05-17 genetic and genomic medicine 10.64898/2026.05.13.26353095 medRxiv

Top 0.2%

7.0%

Background: For many evidence criteria within v3.0 of the ACMG/AMP guidelines, methodologies have been developed to empower their use outside the stipulated evidence strengths. However, no such methodology has been established for case-control data (PS4). With the release of large-scale unselected case-control datasets and expansion of nationally-collected laboratory datasets enriched for pathogenic variant carriers, there is potential to combine datasets across ascertainment contexts in a more quantitative manner using novel likelihood ratio tools. Methods: Using our published PS4-LR-Calculator, we calculated a combined log likelihood ratio (PS4-LLR) across five datasets (three unselected and two enriched), and estimated enrichment of pathogenic variants in clinically-ascertained laboratory data using truncating variant prevalence. Results: Data were combined for 10,817 missense variants from 325,345 female breast cancer patients and 671,006 controls of Western European ancestry for five breast cancer susceptibility genes (BRCA1, BRCA2, PALB2, ATM, CHEK2). A combined LLR was produced for 4,690 missense variants; 927 variants received evidence towards pathogenicity (LLR ≥ 1), and 3,242 received evidence towards benignity (LLR ≤ -1). Conclusion: This flexible, variant-level methodology combines nationally-collected 'enriched' datasets with unselected case-control cohorts, expanding the available information for case-control analysis, boosting power, enabling exploration of atypical penetrance and empowering variant classification.

20

Pandemic-related changes in postpartum depression and anxiety among breastfeeding mothers: a systematic review and meta-analysis

Yu, J.; McCann, M.; Clesham, M.; Fewtrell, M.

2026-05-20 epidemiology 10.64898/2026.05.18.26353483 medRxiv

Top 0.2%

7.0%

Background: The COVID-19 pandemic caused major disruptions to maternity care, breastfeeding support, and social networks. These changes may have increased the risk of postpartum depression, anxiety, and stress among breastfeeding mothers, a population that has been underrepresented in previous reviews. This systematic review and meta-analysis aimed to compare maternal mental health outcomes among breastfeeding mothers before and during the COVID-19 pandemic. Methods: We searched MEDLINE, EMBASE, AMED, Web of Science, WanFang Data, MedRxiv, WHO COVID-19 databases, and grey literature from database inception to December 2023. Eligible studies compared mental health outcomes in breastfeeding mothers before and during the COVID-19 pandemic using validated assessment tools, including the Edinburgh Postnatal Depression Scale (EPDS), Generalized Anxiety Disorder Scale (GAD-7), State-Trait Anxiety Inventory (STAI), or Perceived Stress Scale (PSS). Studies with fewer than 10 participants per group were excluded. Two reviewers independently screened studies, extracted data, and assessed risk of bias using the Joanna Briggs Institute checklist or Newcastle-Ottawa Scale, depending on study design. Random-effects meta-analysis was performed when at least two studies reported comparable outcomes. Results: Twenty-three studies involving breastfeeding mothers from 15 countries were included. Meta-analysis showed significantly higher depressive symptoms during the pandemic compared with the pre-pandemic period, measured by EPDS (standardized mean difference [SMD] = 0.21, 95% confidence interval [CI] 0.14 to 0.29). Maternal anxiety measured by GAD-7 was also significantly higher during the pandemic (SMD = 0.27, 95% CI 0.13 to 0.41). Findings for perceived stress were mixed across studies and could not be pooled because of heterogeneity in reporting methods. 
Limited evidence suggested that mother-infant bonding did not substantially decline during the pandemic despite increased maternal psychological distress. Conclusions: Breastfeeding mothers experienced increased postpartum depression and anxiety symptoms during the COVID-19 pandemic. These findings highlight the importance of maintaining breastfeeding support services, ensuring access to maternal mental health screening, and developing flexible models of postpartum care during future public health emergencies. PROSPERO registration: CRD42022354670.
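
The pooling step described in this abstract, a random-effects meta-analysis of standardized mean differences, can be sketched as follows. The abstract does not name the between-study variance estimator, so the DerSimonian-Laird estimator used here is an assumption, and the study-level effects and variances are hypothetical rather than extracted from the review.

```python
import math


def random_effects_dl(effects, variances, z=1.96):
    """Pool effects with DerSimonian-Laird random-effects weights.

    Returns (pooled, ci_low, ci_high, tau2), where tau2 is the estimated
    between-study variance (truncated at zero).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled - z * se, pooled + z * se, tau2


# Hypothetical study-level SMDs and their variances.
smd = [0.15, 0.30, 0.22, 0.10]
var = [0.004, 0.010, 0.006, 0.012]
pooled, lo, hi, tau2 = random_effects_dl(smd, var)
print(f"SMD={pooled:.2f} (95% CI {lo:.2f} to {hi:.2f}), tau2={tau2:.4f}")
```

When the studies agree closely, tau2 is estimated as zero and the result coincides with a fixed-effect inverse-variance average.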